coordination mechanism
Learning to Lead: Incentivizing Strategic Agents in the Dark
Wu, Yuchen, Zhong, Xinyi, Yang, Zhuoran
The principal-agent model (Ross, 1973; Grossman and Hart, 1992; Smith, 2004; Laffont and Martimort, 2009) is a fundamental framework for understanding decision-making processes with misaligned incentives and information asymmetry, with wide applications across various disciplines such as economics, finance, and computer science (Ratliff et al., 2018; Kamenica, 2012). In this model, the principal represents an entity such as a service provider, a policy maker, or a firm, whose objective is to maximize certain system-level outcomes, such as revenue, social welfare, or efficiency. On the other hand, an agent, who could be a customer, an employee, or an individual participant, aims to optimize his utility based on his private preferences or information, which is not directly observable by the principal. To induce the optimal outcomes, the principal designs and commits to a mechanism, which could be a contract, an incentive scheme, or a policy, that aligns the agent's incentives with the principal's objectives. The optimal mechanism and the agent's optimal strategy against it constitute the equilibrium of the principal-agent model, in certain settings also known as the Stackelberg equilibrium (Stackelberg, 1934, 2010).
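To make the commitment idea concrete, here is a minimal sketch of a pure-strategy Stackelberg equilibrium found by enumeration in a small two-player matrix game. It is an illustration only, not the learning algorithm of the paper: the function name `stackelberg_pure`, the payoff matrices, and the tie-breaking rule (ties go to the principal) are all assumptions, and the principal is assumed to know the agent's payoffs, which the paper's "in the dark" setting does not grant.

```python
from typing import List, Tuple

Matrix = List[List[float]]


def stackelberg_pure(leader_payoff: Matrix, follower_payoff: Matrix) -> Tuple[int, int, float]:
    """Return (leader action, follower action, leader payoff) under pure commitment.

    Rows index the leader's (principal's) actions, columns the follower's (agent's).
    """
    best = None
    for i, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # The follower observes the commitment i and best-responds to it.
        top = max(f_row)
        responses = [j for j, v in enumerate(f_row) if v == top]
        # Assumption: the follower breaks ties in the leader's favor.
        j = max(responses, key=lambda c: l_row[c])
        # The leader keeps the commitment with the highest induced payoff.
        if best is None or l_row[j] > best[2]:
            best = (i, j, l_row[j])
    return best


# Hypothetical payoffs: row 0 is dominant for the leader in a simultaneous game,
# yet committing to row 1 induces a follower response that pays the leader more.
leader = [[2.0, 4.0], [1.0, 3.0]]
follower = [[1.0, 0.0], [0.0, 2.0]]
result = stackelberg_pure(leader, follower)
```

In this example the value of commitment is visible directly: committing to row 0 yields the leader 2, while committing to row 1 yields 3.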
An Adversary-Resistant Multi-Agent LLM System via Credibility Scoring
Ebrahimi, Sana, Dehghankar, Mohsen, Asudeh, Abolfazl
While multi-agent LLM systems show strong capabilities in various domains, they are highly vulnerable to adversarial and low-performing agents. To resolve this issue, we introduce a general and adversary-resistant multi-agent LLM framework based on credibility scoring. We model the collaborative query-answering process as an iterative game, where the agents communicate and contribute to a final system output. Our system associates a credibility score with each agent, and these scores are used when aggregating the team outputs. The credibility scores are learned gradually based on the past contributions of each agent in query answering. Our experiments across multiple tasks and settings demonstrate our system's effectiveness in mitigating adversarial influence and enhancing the resilience of multi-agent cooperation, even in adversary-majority settings.
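The aggregation idea can be sketched as a credibility-weighted vote with a simple score update. This is a hedged illustration and not the scoring rule from the paper: the function names, the multiplicative update, the learning rate `lr`, and the assumption that a correct answer is available as feedback after each query are all hypothetical.

```python
from collections import defaultdict
from typing import Dict


def aggregate(answers: Dict[str, str], credibility: Dict[str, float]) -> str:
    """Pick the answer with the largest total credibility behind it."""
    votes = defaultdict(float)
    for agent, answer in answers.items():
        votes[answer] += credibility[agent]
    return max(votes, key=votes.get)


def update(credibility: Dict[str, float], answers: Dict[str, str],
           correct: str, lr: float = 0.5) -> Dict[str, float]:
    """Assumed update rule: scale a score up after a correct answer, down otherwise."""
    return {
        agent: score * (1.0 + lr if answers[agent] == correct else 1.0 - lr)
        for agent, score in credibility.items()
    }


# Adversary-majority toy round: two adversaries answer "B", one honest agent answers "A".
answers = {"honest": "A", "adv1": "B", "adv2": "B"}
scores = {"honest": 1.0, "adv1": 1.0, "adv2": 1.0}

first = aggregate(answers, scores)        # uniform scores: the majority wins
scores = update(scores, answers, correct="A")
second = aggregate(answers, scores)       # learned scores: the honest agent outweighs them
```

With uniform scores the adversarial majority controls the output; after one round of feedback the honest agent carries more total weight than both adversaries combined.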
MochiSwarm: A testbed for robotic blimps in realistic environments
Xu, Jiawei, Vu, Thong, D'Antonio, Diego S., Saldaña, David
Testing aerial robots in tasks such as pickup-and-delivery and surveillance significantly benefits from high energy efficiency and scalability of the deployed robotic system. This paper presents MochiSwarm, an open-source testbed of light-weight robotic blimps, ready for multi-robot operation without external localization. We introduce the system design in hardware, software, and perception, which capitalizes on modularity, low cost, and light weight. The hardware allows for rapid modification, which enables the integration of additional sensors to enhance autonomy for different scenarios. The software framework supports different actuation models and communication between the base station and multiple blimps. The detachable perception module allows independent blimps to perform tasks that involve detection and autonomous actuation. We showcase a differential-drive module as an example, whose autonomy is enabled by visual servoing using the perception module. A case study of pickup-and-delivery tasks with up to 12 blimps highlights the autonomy of MochiSwarm without external infrastructure.
Neural Coordination and Capacity Control for Inventory Management
Eisenach, Carson, Ghai, Udaya, Madeka, Dhruv, Torkkola, Kari, Foster, Dean, Kakade, Sham
This paper addresses the capacitated periodic review inventory control problem, focusing on a retailer managing multiple products with limited shared resources, such as storage or inbound labor at a facility. Specifically, this paper is motivated by two questions: (1) what does it mean to backtest a capacity control mechanism, and (2) can we devise and backtest a capacity control mechanism that is compatible with recent advances in deep reinforcement learning for inventory management? First, because we only have a single historic sample path of Amazon's capacity limits, we propose a method that samples from a distribution of possible constraint paths covering a space of real-world scenarios. This novel approach allows for more robust and realistic testing of inventory management strategies. Second, we extend the exo-IDP (Exogenous Decision Process) formulation of Madeka et al. (2022) to capacitated periodic review inventory control problems and show that certain capacitated control problems are no harder than supervised learning. Third, we introduce a "neural coordinator", designed to produce forecasts of capacity prices, guiding the system to adhere to target constraints in place of a traditional model predictive controller. Finally, we apply a modified DirectBackprop algorithm for learning a deep RL buying policy and for training the neural coordinator. Our methodology is evaluated through large-scale backtests, demonstrating that RL buying policies with a neural coordinator outperform classic baselines both in terms of cumulative discounted reward and capacity adherence (we see improvements of up to 50% in some cases).
Use of explicit replies as coordination mechanisms in online student debate
Ferreira-Saraiva, Bruno D., Matos-Carvalho, Joao P., Pita, Manuel
People in conversation entrain their linguistic behaviours through spontaneous alignment mechanisms [7], both in face-to-face and computer-mediated communication (CMC) [8]. In CMC, one of the mechanisms through which linguistic entrainment happens is explicit replies. Indeed, the use of explicit replies influences the structure of conversations, favouring the formation of reply-trees typically delineated by topic shifts [5]. The interpersonal coordination mechanisms realized by how actors address each other have been studied using a probabilistic framework proposed by David Gibson [2,3]. Other recent approaches use computational methods and information theory to quantify changes in text. We explore coordination mechanisms concerned with some of the roles utterances play in dialogues, specifically in explicit replies. We identify these roles by finding community structure in the conversation's vocabulary using a non-parametric, hierarchical topic model. Some conversations may always stay on the ground, remaining at the level of general introductory chatter. Others may develop a specific sub-topic in significant depth and detail. Still others may jump between general chatter, off-topic remarks and people agreeing or disagreeing without further elaboration.
Retrospective and Prospective Mixture-of-Generators for Task-oriented Dialogue Response Generation
Pei, Jiahuan, Ren, Pengjie, Monz, Christof, de Rijke, Maarten
Dialogue response generation (DRG) is a critical component of task-oriented dialogue systems (TDSs). Its purpose is to generate proper natural language responses given some context, e.g., historical utterances, system states, etc. State-of-the-art work focuses on how to better tackle DRG in an end-to-end way. Typically, such studies assume that each token is drawn from a single distribution over the output vocabulary, which may not always be optimal. Responses vary greatly with different intents, e.g., domains, system actions. We propose a novel mixture-of-generators network (MoGNet) for DRG, where we assume that each token of a response is drawn from a mixture of distributions. MoGNet consists of a chair generator and several expert generators. Each expert is specialized for DRG w.r.t. a particular intent. The chair coordinates multiple experts and combines the output they have generated to produce more appropriate responses. We propose two strategies to help the chair make better decisions, namely, a retrospective mixture-of-generators (RMoG) and prospective mixture-of-generators (PMoG). The former only considers the historical expert-generated responses until the current time step while the latter also considers possible expert-generated responses in the future by encouraging exploration. In order to differentiate experts, we also devise a global-and-local (GL) learning scheme that forces each expert to be specialized towards a particular intent using a local loss and trains the chair and all experts to coordinate using a global loss. We carry out extensive experiments on the MultiWOZ benchmark dataset. MoGNet significantly outperforms state-of-the-art methods in terms of both automatic and human evaluations, demonstrating its effectiveness for DRG.
Schedule-Driven Coordination for Real-Time Traffic Network Control
Xie, Xiao-Feng (Carnegie Mellon University) | Smith, Stephen F. (Carnegie Mellon University) | Barlow, Gregory J. (Carnegie Mellon University)
Real-time optimization of the dynamic flow of vehicle traffic through a network of signalized intersections is an important practical problem. In this paper, we take a decentralized, schedule-driven coordination approach to address the challenge of achieving scalable network-wide optimization. To be locally effective, each intersection is controlled independently by an on-line scheduling agent. At each decision point, an agent constructs a schedule that optimizes movement of the observable traffic through the intersection, and uses this schedule to determine the best control action to take over the current look-ahead horizon. Decentralized coordination mechanisms, limited to interaction among direct neighbors to ensure scalability, are then layered on top of these asynchronously operating scheduling agents to promote overall performance. As a basic protocol, each agent queries for newly planned output flows from its upstream neighbors to obtain an optimistic projection of future demand. This projection may incorporate non-local influence from indirect neighbors depending on horizon length. Two additional mechanisms are then introduced to dampen "nervousness" and dynamic instability in the network, by adjusting locally determined schedules to better align with those of neighbors. We present simulation results on two traffic networks of tightly-coupled intersections that demonstrate the ability of our approach to establish traffic flows with lower average vehicle wait times than both a simple isolated control strategy and other contemporary coordinated control strategies that use moving average forecast or traditional offset calculation.
Collaboration and Coordination in Secondary Networks for Opportunistic Spectrum Access
Jouini, Wassim, Di Felice, Marco, Bononi, Luciano, Moy, Christophe
In this paper, we address the general case of a coordinated secondary network willing to exploit communication opportunities left vacant by a licensed primary network. Since secondary users (SUs) usually have no prior knowledge of the environment, they need to learn the availability of each channel through sensing techniques, which can, however, be prone to detection errors. We argue that cooperation among secondary users can enable efficient learning and coordination mechanisms in order to maximize the spectrum exploitation by SUs, while minimizing the impact on the primary network. To this end, we provide three novel contributions in this paper. First, we formulate the spectrum selection in secondary networks as an instance of the Multi-Armed Bandit (MAB) problem, and we extend the analysis to the collaborative learning case, in which each SU learns the spectrum occupation and shares this information with other SUs. We show that collaboration among SUs can mitigate the impact of sensing errors on system performance, and improve the convergence of the learning process to the optimal solution. Second, we integrate the learning algorithms with two collaboration techniques, based on modified versions of the Hungarian algorithm and of the Round Robin algorithm, that reduce the interference among SUs. Third, we derive fundamental limits to the performance of cooperative learning algorithms based on Upper Confidence Bound (UCB) policies in a symmetric scenario where all SUs have the same perception of the quality of the resources. Extensive simulation results confirm the effectiveness of our joint learning-collaboration algorithm in protecting the operations of Primary Users (PUs), while maximizing the performance of SUs.
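The bandit formulation mentioned above can be illustrated with a minimal single-user UCB1 sketch. It is not the paper's joint learning-collaboration algorithm: there is no information sharing and no Hungarian or Round Robin assignment, and the function name, the `sense` callback, and the channel availabilities are assumptions made for illustration.

```python
import math
from typing import Callable, List


def ucb1_channel_selection(n_channels: int, horizon: int,
                           sense: Callable[[int], float]) -> List[int]:
    """Run UCB1 for `horizon` rounds and return how often each channel was chosen.

    `sense(k)` is a hypothetical callback returning a reward in [0, 1],
    e.g. 1.0 if channel k was sensed free in this round and 0.0 otherwise.
    """
    counts = [0] * n_channels
    means = [0.0] * n_channels
    for t in range(1, horizon + 1):
        if t <= n_channels:
            k = t - 1  # initialization: sense every channel once
        else:
            # UCB1 index: empirical mean plus an exploration bonus.
            k = max(range(n_channels),
                    key=lambda c: means[c] + math.sqrt(2.0 * math.log(t) / counts[c]))
        reward = sense(k)
        counts[k] += 1
        means[k] += (reward - means[k]) / counts[k]  # incremental mean update
    return counts


# Deterministic toy environment (assumed): fixed availability per channel.
availability = [0.2, 0.5, 0.9]
pulls = ucb1_channel_selection(3, 2000, lambda k: availability[k])
```

In practice `sense` would return noisy, possibly erroneous detections; the deterministic callback here only makes the behavior easy to inspect, with the most available channel receiving most of the selections.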
Towards a Model-Centric Cognitive Architecture for Service Robots
Steck, Andreas (University of Applied Sciences Ulm)
The development of service robots has gained increasing attention in recent years. Advanced robots have to cope with many different situations and contingencies while executing concurrent and interruptible complex tasks. To manage the sheer variety of execution variants, the robot has to decide at run-time which behavior is most appropriate to execute. This requires task coordination mechanisms that provide the flexibility to adapt at run-time and to balance between alternatives.
Multiagent Systems: Challenges and Opportunities for Decision-Theoretic Planning
In this article, I describe several challenges facing the integration of two distinct lines of AI research: (1) decision-theoretic planning (DTP) and (2) multiagent systems. Both areas (especially the second) are attracting considerable interest, but work in multiagent systems often assumes either classical planning models or prespecified economic valuations on the part of the agents in question. By integrating models of DTP in multiagent systems research, more sophisticated multiagent planning scenarios can be accommodated, at the same time explaining precisely how agents determine their valuations for different sources or activities. I discuss several research challenges that emerge from this integration, involving the development of coordination protocols, the reasoning about lack of coordination, and the predicting of behavior in markets. I also briefly mention some opportunities afforded planning agents in multiagent settings and how these might be addressed.